Merge job data into individual level datasets - categorical values

The script below shows how to work with job data from the A scheme. The example was covered in our theme course, which was held twice in 2022.

This example demonstrates how to link data that has employment relationship as its unit type to a dataset that has person as its unit type. This is useful when you want to produce job statistics at the individual level.

Data from the A scheme with the prefix ARBLONN_ARB_ has employment relationship as its unit type, so a person may have several records. To link such data to ordinary person-level datasets, you must first use the collapse command to sum up, per individual, the information from each of that individual's employment relationships. The dataset then has person as its unit type (one record per person) and can be linked to the person-level dataset.
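To make the aggregation step concrete, here is a minimal Python sketch of what summing job-level records per person amounts to. It only illustrates the logic and is not how microdata.no implements collapse. The records, the field names and the helper `collapse_sum` are hypothetical.

```python
from collections import defaultdict

# Hypothetical job-level records: one row per employment relationship.
# Person B holds two jobs and therefore appears twice.
jobs = [
    {"personid": "A", "fullpart": 1, "worktime": 37.5},
    {"personid": "B", "fullpart": 1, "worktime": 37.5},
    {"personid": "B", "fullpart": 1, "worktime": 20.0},
    {"personid": "C", "fullpart": 1, "worktime": 30.0},
]


def collapse_sum(rows, by, fields):
    """Sum the given fields over all rows that share the same `by` key."""
    totals = defaultdict(lambda: {f: 0 for f in fields})
    for row in rows:
        for f in fields:
            totals[row[by]][f] += row[f]
    return {key: dict(vals) for key, vals in totals.items()}


# After collapsing there is exactly one record per person.
persons = collapse_sum(jobs, by="personid", fields=["fullpart", "worktime"])
print(persons["B"])  # {'fullpart': 2, 'worktime': 57.5}
```

Because every remaining job carries the value 1 in `fullpart`, the sum is the number of jobs the person holds, while the summed `worktime` is the total across those jobs.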

The variable ARBEIDSFORHOLD_PERSON is used to link/aggregate from job to person level (the variable contains the person identifier associated with the relevant employment/job).

The procedure below is well suited to categorical job information, such as full-time/part-time status. If you want to link numerical job information, such as working hours and percentage of employment, the approach in this example is recommended: Merge job data into individual level datasets - numerical values.

As the script below shows, it is also possible to combine categorical and numerical information when using this procedure.

require no.ssb.fdb:30 as db

create-dataset vestland
import db/ARBLONN_PERS_KOMMNR 2021-07-31 as residence
keep if substr(residence,1,2) == '46'
import db/ARBLONN_PERS_KJOENN 2021-07-16 as gender

create-dataset fulltime
import db/ARBLONN_ARB_H3LDELTID 2021-07-16 as fullpart
import db/ARBLONN_ARB_ARBEIDSTID 2021-07-16 as worktime
import db/ARBEIDSFORHOLD_PERSON as personid
tabulate fullpart
keep if fullpart == '1'
destring fullpart
collapse(sum) fullpart worktime, by(personid)
rename fullpart full
rename worktime worktime_full
merge full worktime_full into vestland

create-dataset parttime
import db/ARBLONN_ARB_H3LDELTID 2021-07-16 as fullpart
import db/ARBLONN_ARB_ARBEIDSTID 2021-07-16 as worktime
import db/ARBEIDSFORHOLD_PERSON as personid
keep if fullpart == '2'
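destring fullpart
// Every remaining job has fullpart = 2; dividing by 2 gives 1, so the sum below counts part-time jobs per person
replace fullpart = fullpart/2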
collapse(sum) fullpart worktime, by(personid)
rename fullpart part
rename worktime worktime_part
merge part worktime_part into vestland

use vestland
tabulate full
tabulate part
tabulate part gender, rowpct freq
tabulate part, summarize(worktime_part)
tabulate gender, summarize(worktime_part)
tabulate gender, summarize(worktime_full)
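
The part-time branch of the script can be sketched in plain Python as follows. This is a hedged illustration of the logic only: the data are made up, the helpers `part_time_summary` and `merge_into` are hypothetical, and the sketch assumes that merging into the person dataset keeps every person and leaves the new variables missing for people without a matching job record.

```python
# Hypothetical job-level records (fullpart: '1' = full-time, '2' = part-time).
jobs = [
    {"personid": "A", "fullpart": "2", "worktime": 15.0},
    {"personid": "A", "fullpart": "2", "worktime": 10.0},
    {"personid": "B", "fullpart": "1", "worktime": 37.5},
    {"personid": "C", "fullpart": "2", "worktime": 20.0},
]

# Hypothetical person-level dataset; person D has no job record at all.
persons = {
    "A": {"gender": "1"},
    "B": {"gender": "2"},
    "C": {"gender": "2"},
    "D": {"gender": "1"},
}


def part_time_summary(job_rows):
    """Keep part-time jobs, recode 2 -> 1, and sum per person."""
    summary = {}
    for job in job_rows:
        if job["fullpart"] != "2":          # keep if fullpart == '2'
            continue
        count = int(job["fullpart"]) / 2    # destring, then fullpart/2 gives 1
        entry = summary.setdefault(
            job["personid"], {"part": 0, "worktime_part": 0.0}
        )
        entry["part"] += count              # number of part-time jobs
        entry["worktime_part"] += job["worktime"]
    return summary


def merge_into(person_rows, summary, fields):
    """Attach summary fields to each person; None marks a missing value."""
    merged = {}
    for pid, attrs in person_rows.items():
        row = dict(attrs)
        for f in fields:
            row[f] = summary.get(pid, {}).get(f)
        merged[pid] = row
    return merged


vestland = merge_into(persons, part_time_summary(jobs), ["part", "worktime_part"])
for pid, row in vestland.items():
    print(pid, row)
```

In this toy data, person A ends up with two part-time jobs and 25.0 hours in total, while persons B and D get missing values because they have no part-time job records.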